Least Squares SVM for Least Squares TD Learning
نویسندگان
چکیده
We formulate the problem of least squares temporal difference learning (LSTD) in the framework of least squares SVM (LS-SVM). To cope with the large amount (and possible sequential nature) of training data arising in reinforcement learning we employ a subspace based variant of LS-SVM that sequentially processes the data and is hence especially suited for online learning. This approach is adapted from the context of Gaussian process regression and turns the unwieldy original optimization problem (with computational complexity being cubic in the number of processed data) into a reduced problem (with computional complexity being linear in the number of processed data). We introduce a QR decomposition based approach to solve the resulting generalized normal equations incrementally that is numerically more stable than existing recursive least squares based update algorithms. We also allow a forgetting factor in the updates to track non-stationary target functions (i.e. for the use with optimistic policy iteration). Experimental comparison with standard CMAC function approximation indicate that LS-SVMs are well-suited for online RL.
منابع مشابه
Least Squares Support Vector Machine for Constitutive Modeling of Clay
Constitutive modeling of clay is an important research in geotechnical engineering. It is difficult to use precise mathematical expressions to approximate stress-strain relationship of clay. Artificial neural network (ANN) and support vector machine (SVM) have been successfully used in constitutive modeling of clay. However, generalization ability of ANN has some limitations, and application of...
متن کاملDetermination of 137Ba Isotope Abundances in Water Samples by Inductively Coupled Plasma-optical Emission Spectrometry Combined with Least-squares Support Vector Machine Regression
A simple and rapid method for the determination of 137Ba isotope abundances in water samples by inductively coupled plasma-optical emission spectrometry (ICP-OES) coupled with least-squares support vector machine regression (LS-SVM) is reported. By evaluation of emission lines of barium, it was found that the emission line at 493.408 nm provides the best results for the determination...
متن کاملLeast-squares support vector machine and its application in the simultaneous quantitative spectrophotometric determination of pharmaceutical ternary mixture
This paper proposes the least-squares support vector machine (LS-SVM) as an intelligent method applied on absorption spectra for the simultaneous determination of paracetamol (PCT), caffeine (CAF) and ibuprofen (IB) in Novafen. The signal to noise ratio (S/N) increased. Also, In the LS - SVM model, Kernel parameter (σ2) and capacity factor (C) were optimized. Excellent prediction was shown usin...
متن کاملKernel Least-Squares Temporal Difference Learning
Kernel methods have attracted many research interests recently since by utilizing Mercer kernels, non-linear and non-parametric versions of conventional supervised or unsupervised learning algorithms can be implemented and usually better generalization abilities can be obtained. However, kernel methods in reinforcement learning have not been popularly studied in the literature. In this paper, w...
متن کاملIncremental Least-Squares Temporal Difference Learning
Approximate policy evaluation with linear function approximation is a commonly arising problem in reinforcement learning, usually solved using temporal difference (TD) algorithms. In this paper we introduce a new variant of linear TD learning, called incremental least-squares TD learning, or iLSTD. This method is more data efficient than conventional TD algorithms such as TD(0) and is more comp...
متن کامل